Abstract

The proof of the Heisenberg uncertainty relation is modified to 
produce two improvements: (a) the resulting inequality is stronger 
because it includes the covariance between the two observables, and 
(b) the proof lifts certain restrictions on the state to which the rela- 
tion is applied, increasing its generality. The restrictions necessary for 
the standard inequality to apply are not widely known, and they are 
discussed in detail. The classical analog of the Heisenberg relation is 
also derived, and the two are compared. Finally, the modified relation 
is used to address the apparent paradox that eigenfunctions of the z
component of angular momentum $L_z$ do not satisfy the $\phi$-$L_z$ Heisenberg
relation; the resolution is that the restrictions mentioned above
make the usual inequality inapplicable to these states. The modified 
relation does apply, however, and it is shown to be consistent with 
explicit calculations. 

I. INTRODUCTION 

The Heisenberg uncertainty relation in its general form for observables $A$
and $B$,

$$\Delta A\,\Delta B \ge \tfrac{1}{2}\,\bigl|i\langle [A,B]\rangle\bigr|, \tag{1}$$

is proved in every intermediate quantum mechanics textbook (and also in
the Appendix); its best known special case, $\Delta x\,\Delta p \ge \hbar/2$, comes from the
canonical commutation relation $[x,p] = i\hbar$. A very slight modification of a
standard proof of this inequality used by both Bohm 1 and Sakurai 2 yields 
two useful improvements: 

1. The resulting inequality is a stronger one that incorporates the co- 
variance between A and B, a measure of their statistical correlation. 
As a bonus, this allows a comparison with the corresponding classical 
inequality, in which the covariance also appears. 

2. This result lifts certain restrictions that must be imposed on the state 
of the system for the standard Heisenberg inequality to be valid. These 
restrictions are not generally mentioned in textbooks, but you ignore 
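them at your peril. For example, the z component of angular momentum
$L_z$ and the azimuthal angle $\phi$ form a canonical pair, so from
$[\phi, L_z] = i\hbar$ one expects to find $\Delta\phi\,\Delta L_z \ge \hbar/2$. However, consider the
state

$$\psi(\phi) = \frac{1}{\sqrt{2\pi}}\,e^{im\phi}. \tag{2}$$

This is an eigenstate of $L_z$, so $\Delta L_z = 0$, and a quick calculation yields
$\Delta\phi = \pi/\sqrt{3}$, so

$$\Delta\phi\,\Delta L_z = 0 < \frac{\hbar}{2}. \tag{3}$$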

What went wrong? This example has produced a flurry of commentary 
over the years 3-8 , and its resolution lies in the surprising fact that 
eigenstates of L z do not satisfy the criteria necessary for the standard 
Heisenberg principle to apply. I will describe these criteria in detail 
below, as well as why eigenstates of L z do not satisfy them, and once 
I have derived the modified inequality I will show that it is consistent 
with this example. 

The extension to include the covariance is not new 9-11 (in fact, it was known
to Schrödinger 12 and has been discussed before in this journal 13 ), nor is the
modification that removes certain restrictions on the states 14,15 . However,
the proof presented here yields both improvements simultaneously with great
ease, and the two together allow one to discuss issues that make it clear that
quantum mechanics is not a straightforward generalization of classical
statistics, even once one has taken into account the noncommutativity of
observables. Certain uniquely quantum mechanical concerns require that even the
definitions of statistical quantities be made with care, as will be shown below.

II. THE CLASSICAL UNCERTAINTY RELATION 

Since the modified inequality allows me to compare the Heisenberg re- 
lation with its classical counterpart, I will derive the classical relation first. 
(This relation is also derived in Ref. 13.) 

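Let $a$ be a classical statistical variable with mean $\langle a\rangle$ and uncertainty $\Delta a$
defined by

$$(\Delta a)^2 = \langle (a - \langle a\rangle)^2\rangle = \langle a^2\rangle - \langle a\rangle^2, \tag{4}$$

and let $\sigma_{ab}$, the covariance between variables $a$ and $b$, be defined by

$$\sigma_{ab} = \langle (a - \langle a\rangle)(b - \langle b\rangle)\rangle = \langle ab\rangle - \langle a\rangle\langle b\rangle. \tag{5}$$

Notice that $(\Delta a)^2 = \sigma_{aa}$ and that $a$ and $b$ are statistically uncorrelated if
and only if $\sigma_{ab} = 0$. I define a new variable $\tilde a$ by $\tilde a = a - \langle a\rangle$ and similarly
for $b$; then $\langle\tilde a\rangle = \langle\tilde b\rangle = 0$ and

$$(\Delta a)^2 = \langle\tilde a^2\rangle \quad\text{and}\quad \sigma_{ab} = \langle\tilde a\tilde b\rangle. \tag{6}$$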

Now I can prove the uncertainty relation. Let $x$ be any statistical variable;
then $\langle x^2\rangle \ge 0$, and $\langle x^2\rangle = 0$ if and only if $x = 0$. Then for the special case
$x = \tilde a + \lambda\tilde b$ for any real $\lambda$ I have

$$\langle x^2\rangle = \langle\tilde a^2\rangle + \lambda^2\langle\tilde b^2\rangle + 2\lambda\langle\tilde a\tilde b\rangle \ge 0, \tag{7}$$

with equality if and only if $\tilde a + \lambda\tilde b = 0$. The central expression above is a
quadratic in $\lambda$ which according to the inequality has at most one real root
(if it had two then it would dip below the $\lambda$-axis and be negative). The
condition for the quadratic $Ax^2 + Bx + C$ to have at most one real root is
$B^2 - 4AC \le 0$, with equality in the case of exactly one root. In this case the
condition becomes

$$4\langle\tilde a\tilde b\rangle^2 - 4\langle\tilde a^2\rangle\langle\tilde b^2\rangle \le 0, \tag{8}$$

or in terms of $\sigma$,

$$(\Delta a)^2(\Delta b)^2 \ge (\sigma_{ab})^2,$$

$$\Delta a\,\Delta b \ge |\sigma_{ab}|, \tag{9}$$

with equality if and only if $\tilde a + \lambda\tilde b = 0$ for some $\lambda$. This is the uncertainty
principle for classical statistics.
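
As an illustrative numerical check (not part of the original derivation), the
following Python sketch evaluates both sides of Eq. (9) for sampled data. The
function name, the sample sizes, and the particular distributions are all
assumptions chosen only for illustration; averages are taken as equally
weighted sample means.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def uncertainty_and_covariance(a, b):
    """Return (Delta a, Delta b, sigma_ab) using equally weighted sample means."""
    mean_a, mean_b = mean(a), mean(b)
    a_tilde = [x - mean_a for x in a]          # a - <a>
    b_tilde = [y - mean_b for y in b]          # b - <b>
    delta_a = math.sqrt(mean([x * x for x in a_tilde]))
    delta_b = math.sqrt(mean([y * y for y in b_tilde]))
    sigma_ab = mean([x * y for x, y in zip(a_tilde, b_tilde)])
    return delta_a, delta_b, sigma_ab


random.seed(0)
a = [random.gauss(0.0, 1.0) for _ in range(1000)]
b = [0.5 * x + random.gauss(0.0, 1.0) for x in a]  # partially correlated with a
c = [3.0 - 2.0 * x for x in a]                     # exactly linear in a

da, db, sab = uncertainty_and_covariance(a, b)
da2, dc, sac = uncertainty_and_covariance(a, c)

# Slack in Eq. (9): Delta a * Delta b - |sigma_ab|, which should be >= 0.
slack_correlated = da * db - abs(sab)
slack_linear = da2 * dc - abs(sac)
print(slack_correlated, slack_linear)
```

For the partially correlated pair the slack should be strictly positive, while
for the exactly linear pair it should vanish up to rounding, matching the
equality condition stated above.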

III. THE MODIFIED HEISENBERG RELATION 

Now I shall derive the corresponding quantum mechanical result. Let $A$
and $B$ be observables, and let states be denoted by $\psi$, $\chi$, and so on. The inner
product of states $\psi$ and $\chi$ is denoted $(\psi,\chi)$, and the norm $\|\psi\|$ is defined
by $\|\psi\| = \sqrt{(\psi,\psi)}$. Finally, the average of $A$ is defined by $\langle A\rangle = (\psi, A\psi)$.
(I deliberately avoid Dirac's $\langle\psi|A|\psi\rangle$ because it obscures an important issue;
see below.)

The quantum mechanical derivation cannot simply recapitulate the clas- 
sical derivation with the appropriate letters capitalized for two reasons: 

1. A and B might not commute. 

Because of this, the order of the factors in the cross term in the
expansion of $\langle x^2\rangle$ should be preserved. The problem of noncommutativity
actually rears its head earlier, however, in the very definition of
covariance, and I must address that issue first. The classical definition of
covariance is symmetric in $a$ and $b$ ($\sigma_{ab} = \sigma_{ba}$) because $a$ and $b$ always
commute, but if I employed the same definition in the quantum case I
would find $\sigma_{AB} = \sigma_{BA} + \langle [A,B]\rangle$. A covariance symmetric in $A$ and $B$
is preferable, and the easiest way to achieve this is to define

$$\sigma_{AB} = \tfrac{1}{2}\langle (A - \langle A\rangle)(B - \langle B\rangle) + (B - \langle B\rangle)(A - \langle A\rangle)\rangle = \tfrac{1}{2}\langle AB + BA\rangle - \langle A\rangle\langle B\rangle. \tag{10}$$

Now $\sigma_{AB} = \sigma_{BA}$ and $\sigma_{AA}$ has the same form as before, but this
definition suffers from another awkward feature that leads to the second
point.

2. The domains of operators matter. 

The domain of an operator $A$, or $\mathcal{D}(A)$, is the set of all vectors $\psi$ in
the system's Hilbert space such that $A\psi$ is also a well-defined member
of the Hilbert space. (For more on operators with restricted domains,
see Refs. 16, 17, and 18. For some of the consequences for quantum
mechanics, see Ref. 19.) There are three main reasons that a given $\psi$
might not be in $\mathcal{D}(A)$:

(a) The operating prescription for $A$ is not defined for $\psi$. For example,
consider the Hilbert space $L^2(\mathbb{R})$ and the momentum operator
$p = \frac{\hbar}{i}\frac{d}{dx}$. A necessary condition for $p\psi$ to exist is that $\psi$ is
differentiable almost everywhere (being defined almost everywhere
is enough to specify a member of $L^2(\mathbb{R})$); but to be in $L^2(\mathbb{R})$ a
function merely has to be square integrable, which does not imply
differentiability or even continuity. This restriction, though real,
is of little practical interest, however, since it is exceedingly rare
in applications to encounter this problem.

(b) The operating prescription is well-defined, but the resulting vector
is not in the Hilbert space. For example, again consider $L^2(\mathbb{R})$ and
the momentum operator $p$, and this time let $\psi(x) = \sqrt{2|x|}\,e^{-|x|}$.
Now this $\psi$ is in $L^2(\mathbb{R})$ because it is square integrable (in fact, it
is normalized), but its derivative

$$\psi'(x) = \operatorname{sgn}(x)\,\frac{1 - 2|x|}{\sqrt{2|x|}}\,e^{-|x|}, \tag{11}$$

while well-defined everywhere except the origin, is not square
integrable. Hence $\psi'$ is not in $L^2(\mathbb{R})$, so $\psi$ is not in $\mathcal{D}(p)$. (It is
known that $\mathcal{D}(p)$ is dense 20 in $L^2(\mathbb{R})$, so any $L^2$ function is
arbitrarily close to a function in $\mathcal{D}(p)$, and this fact is important for
quantum mechanics. Nonetheless, $\mathcal{D}(p)$ is not the whole Hilbert
space.)

(c) Sometimes $\mathcal{D}(A)$ is restricted to guarantee that $A$ will be Hermitian.
For example, consider the space of $L^2$ functions of the polar
angle $\phi$ and the operator $L_z = \frac{\hbar}{i}\frac{d}{d\phi}$. For any two functions $\psi$ and
$\chi$, integration by parts shows that

$$(\chi, L_z\psi) = (L_z\chi, \psi) + \frac{\hbar}{i}\bigl[\chi^*(2\pi)\,\psi(2\pi) - \chi^*(0)\,\psi(0)\bigr]. \tag{12}$$

Thus $L_z$ is Hermitian only if its domain is restricted to functions $\psi$
such that $\psi(2\pi) = e^{i\alpha}\psi(0)$ for some $\alpha$ (note that strict periodicity
is not required). As innocent as this seems, this is the source of
all of the problems we encountered above with the usual form of
the $\phi$-$L_z$ uncertainty relation, as I will show below.

This issue is the reason that I avoid Dirac's notation $\langle\chi|A|\psi\rangle$; that
expression could mean either $(\chi, A\psi)$, which requires that $\psi$ is in $\mathcal{D}(A)$
but leaves $\chi$ unrestricted, or $(A\chi, \psi)$ ($A$ is Hermitian), which reverses
the restrictions on $\chi$ and $\psi$. The notation used here, on the other hand,
is unambiguous. In the derivation of the uncertainty principle, I must
keep track of all of the domain requirements imposed on the states in
the proof at each step, because the final result will apply only to those
states that satisfy all of the restrictions encountered at every step.
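
The domain issue in case (b) can be made concrete numerically. The sketch
below is only an illustration under stated assumptions: it takes
$\psi(x) = \sqrt{2|x|}\,e^{-|x|}$ as an assumed concrete instance of a normalized
function whose derivative fails to be square integrable, and the helper names
(`integrate_log`, `psi_sq`, `dpsi_sq`) and cutoff values are hypothetical. It
integrates $|\psi|^2$ and $|\psi'|^2$ over $|x| \ge \epsilon$ and shows that the second
integral keeps growing as the cutoff $\epsilon$ shrinks.

```python
import math


def integrate_log(f, x_min, x_max, n=4000):
    """Integrate f over [x_min, x_max] via x = exp(u), Simpson's rule in u."""
    u0, u1 = math.log(x_min), math.log(x_max)
    h = (u1 - u0) / n

    def g(u):
        x = math.exp(u)
        return f(x) * x  # dx = x du

    total = g(u0) + g(u1)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(u0 + k * h)
    return total * h / 3


def psi_sq(x):
    """|psi(x)|^2 for x > 0, assuming psi(x) = sqrt(2|x|) exp(-|x|)."""
    return 2 * x * math.exp(-2 * x)


def dpsi_sq(x):
    """|psi'(x)|^2 for x > 0 for the same assumed psi."""
    return (1 - 2 * x) ** 2 * math.exp(-2 * x) / (2 * x)


# Both functions are even in x, so integrate over x > 0 and double.
norm = 2 * integrate_log(psi_sq, 1e-12, 40.0)
cutoffs = [1e-2, 1e-4, 1e-6, 1e-8]
dnorms = [2 * integrate_log(dpsi_sq, eps, 40.0) for eps in cutoffs]

print("norm of psi squared:", norm)
for eps, value in zip(cutoffs, dnorms):
    print("cutoff", eps, "-> integral of |psi'|^2:", value)
```

Near the origin $|\psi'|^2$ behaves like $1/(2|x|)$, so each hundredfold reduction
of the cutoff should add roughly $\ln 100$ to the integral; the divergence is
logarithmic, which is why this $\psi$ lies in $L^2(\mathbb{R})$ but not in $\mathcal{D}(p)$.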

With these concerns in mind, I will now consider the quantum mechanical
definitions of $\Delta A$ and $\sigma_{AB}$. One usually defines $\Delta A$ by

$$(\Delta A)^2 = (\psi, (A - \langle A\rangle)^2\psi) = \langle A^2\rangle - \langle A\rangle^2, \tag{13}$$

but notice that this expression is defined only for those states that lie in
$\mathcal{D}(A^2)$. (Membership in $\mathcal{D}(A)$ is a prerequisite for membership in $\mathcal{D}(A^2)$.)
Now I would certainly like $\Delta A$ to be defined for every state for which $\langle A\rangle$ is
defined, so I'd like $\Delta A$ to exist for every state in $\mathcal{D}(A)$. The easiest way to
do this is to note that by the Hermiticity of $A$, for all states for which the
above definition is valid it is equivalent to

$$(\Delta A)^2 = ((A - \langle A\rangle)\psi, (A - \langle A\rangle)\psi) = \|(A - \langle A\rangle)\psi\|^2, \tag{14}$$

and this expression is defined for every state in $\mathcal{D}(A)$. Hence I take Eq. (14),
not Eq. (13), to be my definition for $\Delta A$. Remember that it is equivalent to
the old definition whenever the old definition is valid, but the old definition
is not valid in every case where I would like it to be.
Now on to $\sigma_{AB}$. The definition suggested above,

$$\sigma_{AB} = \tfrac{1}{2}\bigl(\psi, [(A - \langle A\rangle)(B - \langle B\rangle) + (B - \langle B\rangle)(A - \langle A\rangle)]\psi\bigr) = \tfrac{1}{2}(\psi, (AB + BA)\psi) - \langle A\rangle\langle B\rangle, \tag{15}$$

requires that both $AB\psi$ and $BA\psi$ exist, or that $\psi$ is in both $\mathcal{D}(AB)$ and
$\mathcal{D}(BA)$. However, I would prefer a definition of $\sigma_{AB}$ that made only the
weaker requirement that $\psi$ is in both $\mathcal{D}(A)$ and $\mathcal{D}(B)$, not least because I
want to relate $\sigma_{AB}$ to $\Delta A$ and $\Delta B$, and the weaker requirement is all that is
needed to guarantee their existence. Fortunately, this is easy; the Hermiticity
of $A$ and $B$ allows me to rewrite the above as

$$\begin{aligned}
\sigma_{AB} &= \tfrac{1}{2}((A - \langle A\rangle)\psi, (B - \langle B\rangle)\psi) + \tfrac{1}{2}((B - \langle B\rangle)\psi, (A - \langle A\rangle)\psi) \\
&= \operatorname{Re}((A - \langle A\rangle)\psi, (B - \langle B\rangle)\psi) \\
&= \operatorname{Re}(A\psi, B\psi) - \langle A\rangle\langle B\rangle,
\end{aligned} \tag{16}$$

and this definition is valid on the larger set of states that belong to both
$\mathcal{D}(A)$ and $\mathcal{D}(B)$, exactly as desired. Hence I take Eq. (16), not Eq. (15),
as the definition of covariance. Again, the two expressions are equivalent
whenever both are defined, but the first does not exist in every case where
I would like it to be, whereas the second does. Finally, in analogy with the
classical case I define $\tilde A = A - \langle A\rangle$, in terms of which

$$\Delta A = \|\tilde A\psi\| \quad\text{and}\quad \sigma_{AB} = \operatorname{Re}(\tilde A\psi, \tilde B\psi). \tag{17}$$

Note that $(\Delta A)^2 = \sigma_{AA}$, just as in the classical case.

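Now for the uncertainty relation. The Cauchy-Schwarz inequality says
that for any states $\psi$ and $\chi$,

$$|(\chi, \psi)| \le \|\chi\|\,\|\psi\|. \tag{18}$$

Then, using Eq. (17),

$$\begin{aligned}
\Delta A\,\Delta B &= \|\tilde A\psi\|\,\|\tilde B\psi\| \\
&\ge |(\tilde A\psi, \tilde B\psi)| \\
&= \sqrt{(\operatorname{Re}(\tilde A\psi, \tilde B\psi))^2 + (\operatorname{Im}(\tilde A\psi, \tilde B\psi))^2} \\
&= \sqrt{\sigma_{AB}^2 + (\operatorname{Im}(\tilde A\psi, \tilde B\psi))^2}.
\end{aligned} \tag{19}$$

A little algebra shows that $\operatorname{Im}(\tilde A\psi, \tilde B\psi) = \operatorname{Im}(A\psi, B\psi)$ (the two inner
products differ only by the real number $\langle A\rangle\langle B\rangle$), so the final result is

$$\Delta A\,\Delta B \ge \sqrt{\sigma_{AB}^2 + (\operatorname{Im}(A\psi, B\psi))^2}. \tag{20}$$

This is the modified Heisenberg uncertainty relation.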
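
To see the pieces of Eq. (20) in a setting with no domain subtleties, the
following Python sketch evaluates them for a two-state system. Everything
specific here is an assumption made for illustration: the observables are
taken to be the Pauli matrices $\sigma_x$ and $\sigma_y$, the state is an arbitrary
normalized two-component vector, and the helper names are hypothetical. In a
finite-dimensional space every operator is defined on every state, so
$\operatorname{Im}(A\psi, B\psi)$ coincides with $-\tfrac{i}{2}\langle [A,B]\rangle$ and the standard bound is its
absolute value.

```python
import cmath
import math


def matvec(m, v):
    """Apply the matrix m (list of rows) to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def inner(chi, psi):
    """Inner product (chi, psi), antilinear in the first argument."""
    return sum(c.conjugate() * p for c, p in zip(chi, psi))


def uncertainty_terms(A, B, psi):
    """Return (Delta A * Delta B, sigma_AB, Im(A psi, B psi)) for a unit vector psi."""
    a_psi, b_psi = matvec(A, psi), matvec(B, psi)
    mean_a = inner(psi, a_psi).real
    mean_b = inner(psi, b_psi).real
    a_t = [x - mean_a * p for x, p in zip(a_psi, psi)]   # (A - <A>) psi
    b_t = [y - mean_b * p for y, p in zip(b_psi, psi)]   # (B - <B>) psi
    delta_a = math.sqrt(inner(a_t, a_t).real)
    delta_b = math.sqrt(inner(b_t, b_t).real)
    sigma_ab = inner(a_t, b_t).real                      # Eq. (17)
    im_part = inner(a_psi, b_psi).imag                   # Im(A psi, B psi)
    return delta_a * delta_b, sigma_ab, im_part


sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
theta, phi = 1.0, 0.7  # arbitrary angles defining the state
psi = [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]

product, sigma_ab, im_part = uncertainty_terms(sigma_x, sigma_y, psi)
modified_bound = math.sqrt(sigma_ab ** 2 + im_part ** 2)  # right side of Eq. (20)
standard_bound = abs(im_part)                             # commutator bound alone
print(product, modified_bound, standard_bound)
```

Whenever the covariance is nonzero, the modified bound should exceed the
commutator bound, which is the sense in which Eq. (20) is the stronger
inequality.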

IV. COMMENTS 

First, note that all of the steps leading to Eq. (20) are valid as long
as $\psi$ lies in both $\mathcal{D}(A)$ and $\mathcal{D}(B)$, and consequently so is the final result.
Therefore, unlike the usual form of the Heisenberg relation, this inequality is
guaranteed to hold in all circumstances in which the quantities involved (the
uncertainties and covariances) are well-defined; there are no more unpleasant
surprises waiting to be discovered.

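Next, I shall recover the uncertainty relation with which we are familiar.
If $\psi$ lies in both $\mathcal{D}(AB)$ and $\mathcal{D}(BA)$, then the following manipulations are
allowed:

$$\begin{aligned}
\operatorname{Im}(A\psi, B\psi) &= -\tfrac{i}{2}(A\psi, B\psi) + \tfrac{i}{2}(B\psi, A\psi) \\
&= -\tfrac{i}{2}(\psi, AB\psi) + \tfrac{i}{2}(\psi, BA\psi) \\
&= -\tfrac{i}{2}(\psi, (AB - BA)\psi) \\
&= -\tfrac{i}{2}\langle [A,B]\rangle.
\end{aligned} \tag{21}$$

Thus when this additional condition is satisfied,

$$\Delta A\,\Delta B \ge \sqrt{\sigma_{AB}^2 + \tfrac{1}{4}\bigl(i\langle [A,B]\rangle\bigr)^2}, \tag{22}$$

which implies the standard Heisenberg inequality.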

Comparing Eq. (9) with either (20) or (22), we see that the sole difference
introduced by quantum mechanics is the term $\operatorname{Im}(A\psi, B\psi)$, which on a fairly
large class of states is essentially half the expectation value of $i$ times the
commutator $[A, B]$. This is the irreducible indeterminacy present even in
states where the two observables are entirely independent statistically.

Now I can reconsider the example of the $\phi$-$L_z$ uncertainty relation
discussed at the beginning. For the commutator form of the inequality to apply,
$\psi$ must lie in the domains of both $\phi L_z$ and $L_z\phi$, and $\psi = (2\pi)^{-1/2}\exp(im\phi)$
does not satisfy the latter criterion. If it did, then that would mean that
$\phi\psi$ would be in the domain of $L_z$, but as I noted earlier every state in the
domain of $L_z$ must satisfy $\psi(2\pi) = e^{i\alpha}\psi(0)$, and

$$\phi\,\psi(\phi) = \frac{\phi}{\sqrt{2\pi}}\,e^{im\phi} \tag{23}$$

vanishes at $\phi = 0$ and is nonvanishing at $\phi = 2\pi$. Hence $(L_z\phi)\psi$ does not
exist, and the commutator inequality does not apply. However, Eq. (20) does
apply, and to find it for this special case I calculate

$$\begin{aligned}
\operatorname{Im}(\phi\psi, L_z\psi) &= -\frac{i}{2}(\phi\psi, L_z\psi) + \frac{i}{2}(L_z\psi, \phi\psi) \\
&= -\frac{\hbar}{2}\int_0^{2\pi}\phi\,\psi^*\frac{d\psi}{d\phi}\,d\phi - \frac{\hbar}{2}\int_0^{2\pi}\phi\,\frac{d\psi^*}{d\phi}\,\psi\,d\phi \\
&= -\frac{\hbar}{2}\int_0^{2\pi}\phi\,\frac{d}{d\phi}|\psi|^2\,d\phi \\
&= -\frac{\hbar}{2}\Bigl[\phi\,|\psi|^2\Bigr]_0^{2\pi} + \frac{\hbar}{2}\int_0^{2\pi}|\psi|^2\,d\phi \\
&= \frac{\hbar}{2}\bigl(1 - 2\pi|\psi(2\pi)|^2\bigr).
\end{aligned} \tag{24}$$

Thus

$$\Delta\phi\,\Delta L_z \ge \sqrt{\sigma_{\phi L_z}^2 + \frac{\hbar^2}{4}\bigl(1 - 2\pi|\psi(2\pi)|^2\bigr)^2}. \tag{25}$$

For the particular $\psi$ in question, $\sigma_{\phi L_z} = 0$ (again because $\psi$ is an eigenstate
of $L_z$) and $|\psi(2\pi)|^2 = (2\pi)^{-1}$, so

$$\Delta\phi\,\Delta L_z \ge 0, \tag{26}$$

which is consistent with what we found at the beginning.
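
The boundary-term result in Eq. (24) can be checked numerically. The Python
sketch below is an illustration only, with several assumptions: units are
chosen so that $\hbar = 1$, the integrals over $[0, 2\pi]$ are evaluated with a
composite Simpson rule, the eigenstate is taken with $m = 2$, and a second state
$\psi = \sin(\phi/2)/\sqrt{\pi}$ (which vanishes at both endpoints) is introduced purely
as a hypothetical comparison case. All function names are illustrative.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1
TWO_PI = 2 * math.pi


def integrate(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals; f may be complex."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def im_phi_lz(psi, dpsi):
    """Im(phi psi, L_z psi) with L_z = (hbar/i) d/dphi, by direct integration."""
    def integrand(t):
        return (t * psi(t)).conjugate() * (HBAR / 1j) * dpsi(t)
    return integrate(integrand, 0.0, TWO_PI).imag


def boundary_formula(psi):
    """Right-hand side of Eq. (24): (hbar/2)(1 - 2 pi |psi(2 pi)|^2)."""
    return 0.5 * HBAR * (1 - TWO_PI * abs(psi(TWO_PI)) ** 2)


def delta_phi(psi):
    """Uncertainty of phi in the state psi."""
    mean = integrate(lambda t: t * abs(psi(t)) ** 2, 0.0, TWO_PI)
    mean_sq = integrate(lambda t: t * t * abs(psi(t)) ** 2, 0.0, TWO_PI)
    return math.sqrt(mean_sq - mean ** 2)


M = 2  # assumed eigenvalue index for the L_z eigenstate


def eigenstate(t):
    return cmath.exp(1j * M * t) / math.sqrt(TWO_PI)


def d_eigenstate(t):
    return 1j * M * eigenstate(t)


def vanishing(t):
    return math.sin(t / 2) / math.sqrt(math.pi)


def d_vanishing(t):
    return math.cos(t / 2) / (2 * math.sqrt(math.pi))


print(delta_phi(eigenstate), math.pi / math.sqrt(3))
print(im_phi_lz(eigenstate, d_eigenstate), boundary_formula(eigenstate))
print(im_phi_lz(vanishing, d_vanishing), boundary_formula(vanishing))
```

For the eigenstate both expressions should vanish, reproducing the trivial
bound of Eq. (26), whereas for the state that vanishes at the endpoints both
should equal $\hbar/2$, recovering the familiar commutator bound.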

Incidentally, if one carried out an analogous derivation with $x$ and $p$ in
place of $\phi$ and $L_z$, one would find

$$\operatorname{Im}(x\psi, p\psi) = \frac{\hbar}{2}\Bigl(1 - \bigl[x\,|\psi|^2\bigr]_{-\infty}^{\infty}\Bigr), \tag{27}$$

so the usual Heisenberg inequality for $x$ and $p$ is valid as long as $\psi$ falls
off faster than $|x|^{-1/2}$ as $|x| \to \infty$. Since $\psi$ is differentiable almost
everywhere it must fall off smoothly, in which case square integrability imposes the
above requirement automatically. Hence the standard form of the Heisenberg
inequality is always valid for $x$ and $p$. It is precisely the fact that the
coordinate $\phi$ is bounded while $x$ is unbounded that allows the sorts of problems
considered in this paper to crop up often in one case and not at all in the
other.

One final note is in order concerning the $\phi$-$L_z$ inequality. In its current
form, Eq. (25), the inequality is not invariant under rotations, as one would
prefer, since the direction corresponding to $\phi = 0$ has no physical significance.
(The fact that one must choose a $\phi = 0$ direction just to define $\phi$ is the source
of the problem.) Hence the $\phi$-$L_z$ inequality has still not been brought to a
quite satisfactory form; to finish the job, one must develop rotation-invariant
definitions of uncertainty and repeat the proof, which has been done in Ref. 4.

APPENDIX: ANOTHER STANDARD PROOF OF THE HEISENBERG RELATION

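This proof of the uncertainty relation is found, for example, in Ref. 21.
Let $A$ and $B$ be observables, let $\psi$ be a state in both $\mathcal{D}(AB)$ and $\mathcal{D}(BA)$
(and thus in $\mathcal{D}(A)$ and $\mathcal{D}(B)$), and let $\tilde A$ and $\tilde B$ be defined as earlier. Then
for any real $\lambda$

$$\begin{aligned}
\|(\tilde A + i\lambda\tilde B)\psi\|^2 &\ge 0 \\
(\psi, (\tilde A - i\lambda\tilde B)(\tilde A + i\lambda\tilde B)\psi) &\ge 0 \\
(\psi, (\tilde A^2 + \lambda^2\tilde B^2 + i\lambda[\tilde A\tilde B - \tilde B\tilde A])\psi) &\ge 0 \\
(\Delta A)^2 + \lambda^2(\Delta B)^2 + i\lambda\langle [A,B]\rangle &\ge 0,
\end{aligned} \tag{28}$$

where the last line used the standard quantum mechanical definition of
uncertainty and the fact that $[\tilde A, \tilde B] = [A, B]$. The commutator of two observables
is anti-Hermitian, so the quantity $i\langle [A, B]\rangle$ is real. Again we have a quadratic
in $\lambda$ with at most one real root, so the same condition as mentioned in the
text yields

$$\bigl(i\langle [A,B]\rangle\bigr)^2 - 4(\Delta A)^2(\Delta B)^2 \le 0, \tag{29}$$

or

$$\Delta A\,\Delta B \ge \tfrac{1}{2}\,\bigl|i\langle [A,B]\rangle\bigr|. \tag{30}$$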

This is the standard Heisenberg uncertainty relation. This result can be
strengthened by replacing $i\lambda$ with $\lambda e^{i\theta}$, treating $\lambda$ as before, and taking the
maximum over all $\theta$; the result is Eq. (22). If one modifies this derivation to
take into account the new definitions of $\Delta A$ and $\sigma_{AB}$, Eq. (17), one recovers
the main result of this paper, Eq. (20). The derivation in Sec. III is much
shorter, however.